Mining-based File Caching in a Hybrid Storage System

Authors

  • Seongjin Lee
  • Youjip Won
  • Sungwoo Hong
Abstract

In this work, we propose a new mining-based file caching scheme for a hybrid storage disk system, with a particular focus on reducing application launch latency. The proposed scheme identifies correlated file accesses in a file access sequence via a sequential pattern mining algorithm and caches correlated files together to maximize caching efficiency. The correlated files are extracted from the access patterns through the proposed mining scheme, which consists of three steps: frequent-pattern-based file extraction, cluster-moving-gap-based file sorting, and frequency- and size-based file prioritization. The extracted correlated files are relocated to an SSD during idle time. DiskSim and NANDSim are used to evaluate the proposed scheme, called Informed Mining. The proposed scheme is compared with a disk-only scheme and five other mining-based file relocation schemes: a mining-based file relocation scheme (Miner), a minimum-distance-based file relocation scheme (Min_Dist), a frequency-based relocation scheme (Fre), a size-based relocation scheme (Size), and a scheme that first relocates to the SSD the files with the highest value of (file size * file access number) (Fr*Sz). In the simulation-based experiments, launch time is reduced by about 50% while using only 10% of the total size of all files accessed during the launch of an application.
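The two simplest baselines named in the abstract, Fre and Fr*Sz, can be illustrated with a short Python sketch. This is not the authors' implementation and it does not reproduce Informed Mining: the function name `select_files_for_ssd`, the greedy fill that skips files too large for the remaining space, and the reading of the 10% figure as an SSD space budget are all assumptions made for illustration.

```python
from collections import Counter


def fre(count, size):
    """Fre baseline: rank files by access count only."""
    return count


def fr_sz(count, size):
    """Fr*Sz baseline: rank files by (file size * file access number)."""
    return count * size


def select_files_for_ssd(access_log, file_sizes, score, budget_percent=10):
    """Pick files to relocate to the SSD, highest score first.

    Hypothetical helper: the budget is budget_percent of the total size of
    all distinct files in the log (an assumed reading of the abstract), and
    a file that does not fit in the remaining space is skipped.
    """
    counts = Counter(access_log)
    total_size = sum(file_sizes[name] for name in counts)
    budget = total_size * budget_percent // 100

    # Sort by descending score; break ties by file name for determinism.
    ranked = sorted(counts, key=lambda n: (-score(counts[n], file_sizes[n]), n))

    chosen, used = [], 0
    for name in ranked:
        if used + file_sizes[name] <= budget:
            chosen.append(name)
            used += file_sizes[name]
    return chosen


# Toy launch trace and file sizes (made-up values, arbitrary units).
log = ["a", "b", "a", "c", "a", "b", "d"]
sizes = {"a": 10, "b": 20, "c": 30, "d": 40}

by_fre = select_files_for_ssd(log, sizes, fre, budget_percent=30)
by_fr_sz = select_files_for_ssd(log, sizes, fr_sz, budget_percent=30)
```

With this toy trace the two policies select the same two files but in a different order, which shows how the ranking key alone changes relocation priority when SSD space is tight.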

Similar resources

C-Miner: Mining Block Correlations in Storage Systems

Block correlations are common semantic patterns in storage systems. These correlations can be exploited for improving the effectiveness of storage caching, prefetching, data layout and disk scheduling. Unfortunately, information about block correlations is not available at the storage system level. Previous approaches for discovering file correlations in file systems do not scale well enough to...

Optimizing Hierarchical Storage Management For Database System

Caching is a classical but effective way to improve system performance. To improve system performance, servers, such as database servers and storage servers, contain significant amounts of memory that act as a fast cache. Meanwhile, as new storage devices such as flash-based solid state drives (SSDs) are added to storage systems over time, using the memory cache is not the only way to improve s...

Improve Replica Placement in Content Distribution Networks with Hybrid Technique

The increasing use of the Internet and its accelerated growth lead to reduced network bandwidth and server capacity; as a result, the quality of Internet services becomes unacceptable to users, even though efficient and effective delivery of web content plays an important role in improving performance. Content distribution networks were introduced to address this issue. Replicatin...

Accelerate Data Sharing in a Wide-Area Networked File Storage System

More and more people now use Internet storage services as a new way of sharing. File sharing through a distributed storage system is quite different from a dedicated sharing application like BitTorrent. As large-file sharing becomes popular, the data transmission rate replaces response delay as the major factor influencing user experience. This paper introduces strategies ...

Serverless Network File Systems

The paper presents a design for a serverless network file system that can dynamically distribute control processing, data storage and caching among a set of cooperating workstations. The aim is to improve performance, scalability and availability of such a peer-based distributed system. The approach adopted in this paper is to build a monolithic system that handles all aspects of storage, cachin...

Journal:
  • J. Inf. Sci. Eng.

Volume 30

Publication date: 2014